A Factorization Scheme for Some Discrete Hartley Transform Matrices
Abstract

Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are
important tools in numerical analysis, signal processing, and statistical methods. The successful application of
transform techniques relies on the existence of efficient fast transforms. In this paper some fast algorithms are
derived that achieve the theoretical lower bound on the multiplicative complexity of the DFT/DHT. The approach is
based on the factorization of DHT matrices. Algorithms for short blocklengths $N \in \{3, 5, 6, 12, 24\}$ are presented.
1 Introduction


Discrete transforms defined over finite or infinite fields have been playing a relevant role in numerical analysis. A striking
example is the discrete Fourier transform, which has found applications in several areas. Another relevant example
concerns the discrete Hartley transform (DHT) [1], the discrete version of the integral transform introduced by Hartley
in [2]. Besides its numerical appropriateness, the DHT has proven over the years to be a powerful tool [3]-[5].
A decisive factor for applications of the DFT has been the existence of fast transforms (FT) for computing it [6]. Fast
Hartley transforms also exist and are deeply connected to the DHT applications [7], [8]. Recent promising applications
of discrete transforms concern the use of finite field Hartley transforms [9] to design digital multiplex systems, efficient
multiple access systems [10] and multilevel spread spectrum sequences [11].

Discrete transforms presenting a low multiplicative complexity have been an object of interest for a long time. Very
efficient algorithms such as the Prime Factor Algorithm (PFA) or the Winograd Fourier Transform Algorithm (WFTA) have
also been used [12], [13]. The minimal multiplicative complexity, $\mu(\cdot)$, of the one-dimensional DFT for all possible
sequence lengths, $N$, can be computed by converting the DFT into a set of multi-dimensional cyclic convolutions. A
lower bound on the multiplicative complexity of a DFT is given in [14, Theorem 5.4, p. 98]. The values of $\mu_{\mathrm{DFT}}(N)$ for
some short blocklengths are given in Table 1 (some local minima of $\mu$).

The discrete Hartley transform of a signal $v_i$, $i = 0, 1, 2, \ldots, N-1$, is defined by

$$V_k \triangleq \sum_{i=0}^{N-1} v_i \, \mathrm{cas}\!\left(\frac{2\pi i k}{N}\right), \qquad k = 0, 1, \ldots, N-1, \tag{1}$$

where $\mathrm{cas}(x) = \cos(x) + \sin(x)$ is the “cosine and sine” Hartley symmetric kernel.

Table 1: Minimal multiplicative complexity for computing a DFT of length $N$

$N$     $\mu_{\mathrm{DFT}}(N)$
3       1
5       3
6       2
12      4
24      12
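A direct evaluation of the DHT can be sketched as follows, assuming the usual unnormalized form $V_k = \sum_i v_i\,\mathrm{cas}(2\pi i k/N)$ (which agrees with the entries of the matrix $\mathbf{H}_3$ used in Section 2). The function names `cas`, `dht` and `dht_matrix` are illustrative and do not come from the paper; this $O(N^2)$ routine only serves as a reference against which fast algorithms can be checked.

```python
import math


def cas(x):
    """Hartley kernel: cas(x) = cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)


def dht(v):
    """Direct O(N^2) evaluation of the unnormalized DHT of a real sequence."""
    n = len(v)
    return [
        sum(v[i] * cas(2 * math.pi * i * k / n) for i in range(n))
        for k in range(n)
    ]


def dht_matrix(n):
    """Transform matrix H_N with entries cas(2*pi*k*i/N) (row k, column i)."""
    return [[cas(2 * math.pi * k * i / n) for i in range(n)] for k in range(n)]
```

For example, `dht_matrix(3)` reproduces the entries $(\sqrt{3}-1)/2$ and $-(\sqrt{3}+1)/2$ that appear in the 3-point transform matrix.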

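In this paper, some FTs are presented which meet the minimal multiplicative complexity. There is a simple
relationship between the DHT and the DFT of a given real discrete signal $v$. If $v_i \overset{\mathcal{F}}{\longleftrightarrow} F_k$ is a DFT pair and
$v_i \overset{\mathcal{H}}{\longleftrightarrow} H_k$ is the corresponding DHT pair, then we have

$$H_k = \Re(F_k) - \Im(F_k)$$

and

$$F_k = \frac{1}{2}\left[(H_k + H_{N-k}) - j\,(H_k - H_{N-k})\right],$$

for $k = 0, 1, \ldots, N-1$, with the index $N-k$ taken modulo $N$.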
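The two relations between the DFT and the DHT of a real signal can be checked numerically with a short sketch. It assumes the DFT sign convention $F_k = \sum_i v_i e^{-j2\pi ik/N}$ and the unnormalized DHT; the function names are illustrative only.

```python
import cmath
import math


def dft(v):
    """Direct DFT with kernel exp(-j*2*pi*i*k/N) (assumed sign convention)."""
    n = len(v)
    return [
        sum(v[i] * cmath.exp(-2j * math.pi * i * k / n) for i in range(n))
        for k in range(n)
    ]


def dht(v):
    """Direct unnormalized DHT with kernel cas(x) = cos(x) + sin(x)."""
    n = len(v)
    return [
        sum(
            v[i] * (math.cos(2 * math.pi * i * k / n) + math.sin(2 * math.pi * i * k / n))
            for i in range(n)
        )
        for k in range(n)
    ]


def dht_from_dft(F):
    """H_k = Re(F_k) - Im(F_k), valid for a real input signal."""
    return [f.real - f.imag for f in F]


def dft_from_dht(H):
    """F_k = ((H_k + H_{N-k}) - j (H_k - H_{N-k})) / 2, index N-k taken mod N."""
    n = len(H)
    out = []
    for k in range(n):
        hk, hnk = H[k], H[(n - k) % n]
        out.append(0.5 * complex(hk + hnk, -(hk - hnk)))
    return out
```

Applying `dht_from_dft(dft(v))` and `dft_from_dht(dht(v))` to any real sequence `v` should reproduce `dht(v)` and `dft(v)` up to rounding error.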

Therefore, a fast algorithm for the DHT is also a fast algorithm for the DFT and vice versa [14, Corollary 6.9].
Besides being a real transform, the DHT is also an involution, i.e., the kernel of the inverse transform is the same as
that of the direct transform (self-inverse transform). Since the DHT is a more symmetrical version of a discrete transform,
this symmetry is exploited to derive an FT that requires the minimal number of real floating-point multiplications.
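The self-inverse property can be illustrated numerically. For the unnormalized kernel $\mathrm{cas}(2\pi ki/N)$, applying the transform matrix twice returns the input scaled by $N$, i.e. $\mathbf{H}_N \mathbf{H}_N = N\,\mathbf{I}$. The sketch below checks this; the helper names are hypothetical.

```python
import math


def dht_matrix(n):
    """Unnormalized DHT matrix with entries cas(2*pi*k*i/N)."""
    return [
        [
            math.cos(2 * math.pi * k * i / n) + math.sin(2 * math.pi * k * i / n)
            for i in range(n)
        ]
        for k in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


def is_scaled_identity(m, scale, tol=1e-9):
    """True if m equals scale * I within the given tolerance."""
    n = len(m)
    return all(
        abs(m[r][c] - (scale if r == c else 0.0)) < tol
        for r in range(n)
        for c in range(n)
    )
```

With these helpers, `is_scaled_identity(matmul(H, H), N)` holds for `H = dht_matrix(N)`, so the same fast algorithm serves for the direct and inverse transforms up to the factor $1/N$.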


2 Computing the 3-point DHT


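Let $\mathbf{v} = (v_0, v_1, v_2)^T \overset{\mathcal{H}}{\longleftrightarrow} \mathbf{V} = (V_0, V_1, V_2)^T$ be a discrete Hartley transform pair of blocklength 3. The matrix formulation
of this transform corresponds to $\mathbf{V} = \mathbf{H}_3 \mathbf{v}$, where $\mathbf{H}_3$ is given by

$$\mathbf{H}_3 = \begin{bmatrix} 1 & 1 & 1 \\ 1 & \frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2} \\ 1 & -\frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} \end{bmatrix}. \tag{2}$$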


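Note that the irrational elements of $\mathbf{H}_3$ have the same fractional part, i.e., apart from their integer part, they have
the same absolute value (observe that $(\sqrt{3}-1)/2 \approx 0.366\ldots$ and $(\sqrt{3}+1)/2 \approx 1.366\ldots$). So let us make the following
decomposition:

$$\mathbf{H}_3 = \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & \frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}-1}{2} \\ 1 & -\frac{\sqrt{3}-1}{2} & \frac{\sqrt{3}-1}{2} \end{bmatrix}}_{\mathbf{H}'_3} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}. \tag{3}$$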


Since the 2nd and 3rd columns of the new matrix $\mathbf{H}'_3$ have the same elements (taken in absolute value), we can
consider the new variables $v_1 + v_2$ and $v_1 - v_2$. This substitution yields the following matrix equation:

$$\begin{bmatrix} v_0 \\ v_1 + v_2 \\ v_1 - v_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} v_0 \\ v_1 \\ v_2 \end{bmatrix}. \tag{4}$$

Figure 1: 3-point DHT fast algorithm diagram.

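So the transform can be expressed by:

$$\mathbf{V} = \underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & \frac{\sqrt{3}-1}{2} \\ 1 & 0 & -\frac{\sqrt{3}-1}{2} \end{bmatrix}}_{\mathbf{H}''} \begin{bmatrix} v_0 \\ v_1 + v_2 \\ v_1 - v_2 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix} \begin{bmatrix} v_0 \\ v_1 \\ v_2 \end{bmatrix}. \tag{5}$$

Observe that the new matrix $\mathbf{H}''$ can be split into two new matrices, as shown below:

$$\mathbf{H}'' = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{\sqrt{3}-1}{2} \end{bmatrix}. \tag{6}$$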


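Joining the above equations in a single statement, we have that:

$$\mathbf{V} = \left( \underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & -1 \end{bmatrix}}_{\mathbf{C}} \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & a \end{bmatrix}}_{\mathbf{B}} \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & -1 \end{bmatrix}}_{\mathbf{A}} + \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}}_{\mathbf{L}} \right) \mathbf{v}, \tag{7}$$

where $a = (\sqrt{3}-1)/2$.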

One can recognize the pre-addition matrix $\mathbf{A}$, the multiplication matrix $\mathbf{B}$ and the post-addition matrix $\mathbf{C}$. This
algorithm introduces a new kind of matrix, denoted by $\mathbf{L}$, which we will call a “layer matrix”.

With this notation, we can now express the entire algorithm compactly by the following equation:

$$\mathbf{V} = (\mathbf{C}\mathbf{B}\mathbf{A} + \mathbf{L})\,\mathbf{v}. \tag{8}$$

This algorithm has only one nontrivial multiplication and 7 additions. Figure 1 shows a schematic diagram for this
transform.
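The 3-point algorithm $\mathbf{V} = (\mathbf{CBA} + \mathbf{L})\mathbf{v}$ can be written out as straight-line code, which makes the operation count explicit. The sketch below is one possible scheduling of the additions (the paper's signal-flow diagram may order them differently); `dht3_fast` and `dht_direct` are illustrative names.

```python
import math

A_CONST = (math.sqrt(3) - 1) / 2  # the constant a = (sqrt(3) - 1) / 2


def dht3_fast(v):
    """3-point DHT using 1 multiplication and 7 additions."""
    v0, v1, v2 = v
    s = v1 + v2            # pre-addition matrix A (addition 1)
    d = v1 - v2            # pre-addition matrix A (addition 2)
    m = A_CONST * d        # multiplication matrix B (the only multiplication)
    V0 = v0 + s            # post-addition matrix C (addition 3)
    V1 = (v0 + m) - v2     # C (addition 4), then layer matrix L (addition 5)
    V2 = (v0 - m) - v1     # C (addition 6), then layer matrix L (addition 7)
    return [V0, V1, V2]


def dht_direct(v):
    """Reference O(N^2) DHT from the definition, used only for checking."""
    n = len(v)
    return [
        sum(
            v[i] * (math.cos(2 * math.pi * i * k / n) + math.sin(2 * math.pi * i * k / n))
            for i in range(n)
        )
        for k in range(n)
    ]
```

The subtraction of $v_2$ and $v_1$ in the last two outputs is the contribution of the layer matrix $\mathbf{L}$, which restores the integer part set aside in the decomposition of $\mathbf{H}_3$.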
3 Computing the 5-point DHT




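Let $\mathbf{v} \overset{\mathcal{H}}{\longleftrightarrow} \mathbf{V}$ be a 5-point DHT pair. The corresponding matrix formulation is now $\mathbf{V} = \mathbf{H}_5 \mathbf{v}$, where

$$\mathbf{H}_5 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & a & b & c & d \\ 1 & b & d & a & c \\ 1 & c & a & d & b \\ 1 & d & c & b & a \end{bmatrix}, \tag{9}$$

where

$$a = \tfrac{1}{4}\left(\sqrt{5} - 1 + \sqrt{2}\sqrt{5 + \sqrt{5}}\right),$$

$$b = -\tfrac{1}{4}\left(\sqrt{5} + 1 - \sqrt{2}\sqrt{5 - \sqrt{5}}\right),$$

$$c = -\tfrac{1}{4}\left(\sqrt{5} + 1 + \sqrt{2}\sqrt{5 - \sqrt{5}}\right),$$

$$d = \tfrac{1}{4}\left(\sqrt{5} - 1 - \sqrt{2}\sqrt{5 + \sqrt{5}}\right).$$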

We can combine the 2nd and the 4th columns as well as the 3rd and the 5th ones using a Hadamard transform unit of
length 2 (a butterfly). As a result of the process of combining columns, we achieve the following matrix factorization:

$$\mathbf{V} = \mathbf{C}_3 \mathbf{C}_2 \mathbf{C}_1 \mathbf{B} \mathbf{A}_2 \mathbf{A}_1 \mathbf{v}, \tag{10}$$

where $\mathbf{A}_1$ is the pre-addition matrix, $\mathbf{B}$ is the multiplication matrix and $\mathbf{C}_i$ are the post-addition matrices. These matrices
are detailed below:


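[Matrices $\mathbf{A}_1$, $\mathbf{A}_2$, $\mathbf{B}$, $\mathbf{C}_1$, $\mathbf{C}_2$ and $\mathbf{C}_3$: entries illegible in the source. The legible fragments are length-2 butterflies $\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ and, in $\mathbf{A}_2$, the block $\begin{bmatrix} e & f \\ f & -e \end{bmatrix}$.]

where $e = \sqrt{2}\sqrt{5 - \sqrt{5}}/2$ and $f = \sqrt{2}\sqrt{5 + \sqrt{5}}/2$.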

Now let us work on the matrix $\mathbf{A}_2$. Note that it contains multiplicative elements, namely $e$ and $f$, and four additions.
We can go further and factorize this matrix in such a way that purely multiplicative and purely additive matrices appear. This
can be done by the following method:

[Factorization of $\mathbf{A}_2$ into additive and multiplicative matrices: entries illegible in the source. The legible multiplicative entries include $f + e$ and $f$.]

Figure 2: 5-point DHT fast algorithm.


Thus, in Equation (10) one should replace the matrix $\mathbf{A}_2$ by its factorization. The full decomposition of the original
transform matrix $\mathbf{H}_5$ is then achieved.

The arithmetic complexity of this algorithm is 3 multiplications and 17 additions. The schematic diagram is depicted
in Figure 2.
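To illustrate how a block of the form $\begin{bmatrix} e & f \\ f & -e \end{bmatrix}$ can be split into additions and multiplications, the sketch below uses a standard three-multiplication scheme. This is an assumption made for illustration and is not necessarily the factorization of $\mathbf{A}_2$ used by the authors; it only shows that the four products of the direct evaluation can be traded for three products by the precomputed constants $f$, $e - f$ and $f + e$. All function and constant names are hypothetical.

```python
import math

E = math.sqrt(2) * math.sqrt(5 - math.sqrt(5)) / 2  # e as defined in the text
F = math.sqrt(2) * math.sqrt(5 + math.sqrt(5)) / 2  # f as defined in the text

# Constants that would be precomputed once (not counted as run-time operations).
E_MINUS_F = E - F
F_PLUS_E = F + E


def block_direct(x1, x2):
    """Direct evaluation of [[e, f], [f, -e]] @ [x1, x2]: 4 multiplications."""
    return (E * x1 + F * x2, F * x1 - E * x2)


def block_three_mults(x1, x2):
    """Same product with 3 multiplications and 3 additions."""
    t = F * (x1 + x2)          # f*x1 + f*x2
    y1 = t + E_MINUS_F * x1    # (f*x1 + f*x2) + (e - f)*x1 = e*x1 + f*x2
    y2 = t - F_PLUS_E * x2     # (f*x1 + f*x2) - (f + e)*x2 = f*x1 - e*x2
    return (y1, y2)
```

Both functions return the same pair for any real inputs, up to rounding error.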


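4 Computing a 6-point DHT

Let us now consider $\mathbf{v} \overset{\mathcal{H}}{\longleftrightarrow} \mathbf{V}$, the transform pair related by the Hartley matrix $\mathbf{H}_6$, where

$$\mathbf{H}_6 = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
1 & \frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} & -1 & -\frac{\sqrt{3}+1}{2} & -\frac{\sqrt{3}-1}{2} \\
1 & \frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2} & 1 & \frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2} \\
1 & -1 & 1 & -1 & 1 & -1 \\
1 & -\frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} & 1 & -\frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} \\
1 & -\frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2} & -1 & \frac{\sqrt{3}-1}{2} & \frac{\sqrt{3}+1}{2}
\end{bmatrix}.$$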
Using the Hadamard transform to combine the 1st and the 4th columns, the 2nd and the 5th columns, and finally the
3rd and the 6th columns, the matrix algorithm can be reduced to:

$$\mathbf{V} = \underbrace{\begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & \frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} \\
1 & \frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2} & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 1 \\
1 & -\frac{\sqrt{3}+1}{2} & \frac{\sqrt{3}-1}{2} & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -\frac{\sqrt{3}-1}{2} & -\frac{\sqrt{3}+1}{2}
\end{bmatrix}}_{\mathbf{H}'_6} \mathbf{A}_1 \mathbf{v}, \tag{13}$$


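where $\mathbf{A}_1$ is the first pre-addition matrix. The matrix $\mathbf{A}_1$ is detailed below:

$$\mathbf{A}_1 = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & 0 & -1
\end{bmatrix}. \tag{14}$$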


The form of the matrix $\mathbf{A}_1$ is the same for all transforms of even blocklength. The explanation for this fact is given by
the lemma below.

Lemma 1 The pre-addition matrix $\mathbf{A}_1$ of a Hadamard decomposition of an even-blocklength DHT has the following
construction:

$$\mathbf{A}_1 = \begin{bmatrix} \mathbf{I}_{N/2} & \mathbf{I}_{N/2} \\ \mathbf{I}_{N/2} & -\mathbf{I}_{N/2} \end{bmatrix} = \mathbf{Had}_2 \otimes \mathbf{I}_{N/2}, \tag{15}$$

where $\mathbf{Had}_2$ is the Hadamard matrix of order 2, $\otimes$ is the direct product and $\mathbf{I}_{N/2}$ is the identity matrix of order $N/2$.

Proof: The elements of the Hartley matrix, $\mathbf{H}_N$, are governed by the property $h_{k,i+N/2} = (-1)^k h_{k,i}$, where $h_{k,i}$ is the
$(k,i)$ element of the transform matrix. This property can be derived from the $\mathrm{cas}(\cdot)$ addition rule [3], $\mathrm{cas}(a - b) =
\cos(b)\,\mathrm{cas}(a) - \sin(b)\,\mathrm{cas}'(a)$, where $\mathrm{cas}'(a) = \cos(a) - \sin(a)$, by taking $b = -\pi k$. Consequently, we have that:

$$h_{k,i+N/2} = \mathrm{cas}\!\left(\frac{2\pi k\,(i + \frac{N}{2})}{N}\right) = \mathrm{cas}\!\left(\frac{2\pi k i}{N} + \pi k\right) = (-1)^k\,\mathrm{cas}\!\left(\frac{2\pi k i}{N}\right) = (-1)^k h_{k,i}.$$

Therefore the $i$th and the $(i + N/2)$th columns have the same absolute values, which allows us to combine them via a
Hadamard transform. New variables arise from this technique: $(v_i + v_{i+N/2})$ and $(v_i - v_{i+N/2})$, $i = 0, \ldots, N/2 - 1$. These
new variables are generated by the matrix $\mathbf{A}_1$. □
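Lemma 1 can be checked numerically for specific even blocklengths. The sketch below builds $\mathbf{Had}_2 \otimes \mathbf{I}_{N/2}$ with a small Kronecker-product helper and tests the column property $h_{k,i+N/2} = (-1)^k h_{k,i}$. The helper names are hypothetical, and a numerical check of this kind is only a sanity test, not a substitute for the proof.

```python
import math


def cas(x):
    """Hartley kernel: cas(x) = cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)


def kron(a, b):
    """Kronecker (direct) product of two matrices stored as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    """Identity matrix of order n."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def pre_addition_matrix(n):
    """A_1 = Had_2 (x) I_{N/2} for an even blocklength n."""
    had2 = [[1, 1], [1, -1]]
    return kron(had2, identity(n // 2))


def column_sign_property_holds(n, tol=1e-12):
    """Check h[k][i + N/2] == (-1)**k * h[k][i] for all k and i < N/2."""
    half = n // 2
    return all(
        abs(
            cas(2 * math.pi * k * (i + half) / n)
            - (-1) ** k * cas(2 * math.pi * k * i / n)
        )
        < tol
        for k in range(n)
        for i in range(half)
    )
```

For $N = 6$, `pre_addition_matrix(6)` returns the matrix of Equation (14), and `column_sign_property_holds` returns `True` for $N = 6$, 12 and 24.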

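Using the same strategy described in the 3-point DHT algorithm, we can set aside the integer part of some elements
of the matrix $\mathbf{H}'_6$. This procedure yields a new, more “balanced” matrix. These steps are represented by the following
equation:

$$\mathbf{H}'_6 = \underbrace{\begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & a & a \\
1 & a & -a & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 1 \\
1 & -a & a & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -a & -a
\end{bmatrix}}_{\mathbf{H}''} + \underbrace{\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1
\end{bmatrix}}_{\mathbf{L}}, \tag{16}$$

with $a = (\sqrt{3}-1)/2$. See that $\mathbf{V} = (\mathbf{H}'' + \mathbf{L})\mathbf{A}_1 \mathbf{v}$.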

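Carrying out the procedure of combining columns which “agree”, we obtain the next pre-addition matrix $\mathbf{A}_2$:

$$\mathbf{A}_2 = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}.$$

This allows the matrix $\mathbf{H}''$ to be written as $\mathbf{H}'' = \mathbf{H}''' \cdot \mathbf{A}_2$, as seen in this equation:

$$\mathbf{H}'' = \underbrace{\begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & a & 0 \\
1 & 0 & a & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & -1 \\
1 & 0 & -a & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -a & 0
\end{bmatrix}}_{\mathbf{H}'''} \mathbf{A}_2, \tag{17}$$

with $a = (\sqrt{3}-1)/2$.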
Figure 3: 6-point DHT fast algorithm diagram.

Table 2: Arithmetic complexity for the proposed 12- and 24-point DHT fast algorithms. The function $a(N)$ returns the
additive complexity of the implementation.

$N$     $\mu(N)$     $a(N)$
12      4            52
24      12           138


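Now observe that the factorization of $\mathbf{H}'''$ yields the multiplication matrix, $\mathbf{B}$, and the post-addition matrix, $\mathbf{C}$:

$$\mathbf{H}''' = \underbrace{\begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & -1 \\
1 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 0
\end{bmatrix}}_{\mathbf{C}} \underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & a & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & a & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{\mathbf{B}}, \tag{18}$$

where $a = (\sqrt{3}-1)/2$.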

We have then completed the algorithm, and it can be represented by:

$$\mathbf{V} = (\mathbf{C}\mathbf{B}\mathbf{A}_2 + \mathbf{L})\,\mathbf{A}_1 \mathbf{v}. \tag{19}$$

This algorithm has two multiplications and 20 additions and is depicted in Figure 3.
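The 6-point algorithm $\mathbf{V} = (\mathbf{CBA}_2 + \mathbf{L})\mathbf{A}_1\mathbf{v}$ can be unrolled into straight-line code to make the count of 2 multiplications and 20 additions visible. The sketch below assumes that the intermediate variables are ordered as $(u_0, u_1 + u_2, u_1 - u_2, u_3, u_4 + u_5, u_4 - u_5)$ after the second pre-addition stage; the actual ordering in the paper's diagram may differ. The names `dht6_fast` and `dht_direct` are illustrative.

```python
import math

A_CONST = (math.sqrt(3) - 1) / 2  # the constant a = (sqrt(3) - 1) / 2


def dht6_fast(v):
    """6-point DHT using 2 multiplications and 20 additions."""
    # First pre-addition stage A_1 = Had_2 (x) I_3: 6 additions.
    u0, u1, u2 = v[0] + v[3], v[1] + v[4], v[2] + v[5]
    u3, u4, u5 = v[0] - v[3], v[1] - v[4], v[2] - v[5]
    # Second pre-addition stage A_2: 4 additions.
    p, q = u1 + u2, u1 - u2
    r, s = u4 + u5, u4 - u5
    # Multiplication stage B: 2 multiplications.
    mq = A_CONST * q
    mr = A_CONST * r
    # Post-additions C (6 additions) and layer matrix L (4 additions).
    V0 = u0 + p
    V1 = (u3 + mr) + u4
    V2 = (u0 + mq) - u2
    V3 = u3 - s
    V4 = (u0 - mq) - u1
    V5 = (u3 - mr) - u5
    return [V0, V1, V2, V3, V4, V5]


def dht_direct(v):
    """Reference O(N^2) DHT from the definition, used only for checking."""
    n = len(v)
    return [
        sum(
            v[i] * (math.cos(2 * math.pi * i * k / n) + math.sin(2 * math.pi * i * k / n))
            for i in range(n)
        )
        for k in range(n)
    ]
```

Comparing `dht6_fast(v)` with `dht_direct(v)` on arbitrary real inputs is a quick way to confirm that the factorization reproduces $\mathbf{H}_6 \mathbf{v}$.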

5 Computing the 12- and the 24-point DHT 

The procedure used to derive the 3- and 6-point DHT fast transforms can be extended to other blocklengths, such as 12
and 24. We derived these algorithms and achieved the arithmetic complexity shown in Table 2.

Figure 4 shows a diagram of the 24-point DHT fast transform, in which the shorter transforms (3-, 6-, 12-point) are
embedded.

The algorithms proposed so far can be described in a general framework according to the following proposition. 
Figure 4: 24-point DHT fast algorithm diagram (derivations omitted). Shorter transforms are embedded.

Proposition 1 The DHT decomposition has the following general formulation

$$\mathbf{V} = \left( \left( \left( \left( \mathbf{C}_n \mathbf{B}_n \mathbf{A}_n + \mathbf{L}_{n-1} \right) \mathbf{C}_{n-1} \mathbf{B}_{n-1} \mathbf{A}_{n-1} + \cdots + \mathbf{L}_2 \right) \mathbf{C}_2 \mathbf{B}_2 \mathbf{A}_2 + \mathbf{L}_1 \right) \mathbf{C}_1 \mathbf{B}_1 \mathbf{A}_1 + \mathbf{L}_0 \right) \mathbf{v}, \tag{20}$$

where $n$ is the number of “layers” in the decomposition. □
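The nested form of Proposition 1 can be evaluated from the innermost layer outwards. The sketch below assumes the reading $(\cdots((\mathbf{C}_n\mathbf{B}_n\mathbf{A}_n + \mathbf{L}_{n-1})\mathbf{C}_{n-1}\mathbf{B}_{n-1}\mathbf{A}_{n-1} + \cdots + \mathbf{L}_1)\mathbf{C}_1\mathbf{B}_1\mathbf{A}_1 + \mathbf{L}_0$ and uses the 3-point algorithm of Section 2 as the single-layer instance ($n = 1$, $\mathbf{L}_0 = \mathbf{L}$). The function `layered_matrix` and its argument layout are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


def matadd(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def layered_matrix(stages, layer_mats):
    """Evaluate the nested decomposition.

    stages     = [(C_1, B_1, A_1), ..., (C_n, B_n, A_n)]
    layer_mats = [L_0, ..., L_{n-1}]
    """
    def cba(stage):
        c, b, a = stage
        return matmul(matmul(c, b), a)

    m = cba(stages[-1])                      # innermost term C_n B_n A_n
    for j in range(len(stages) - 1, 0, -1):  # j = n-1, ..., 1
        m = matmul(matadd(m, layer_mats[j]), cba(stages[j - 1]))
    return matadd(m, layer_mats[0])          # outermost layer L_0


# Single-layer instance: the 3-point DHT, V = (C B A + L) v.
a = (math.sqrt(3) - 1) / 2
C3 = [[1, 1, 0], [1, 0, 1], [1, 0, -1]]
B3 = [[1, 0, 0], [0, 1, 0], [0, 0, a]]
A3 = [[1, 0, 0], [0, 1, 1], [0, 1, -1]]
L3 = [[0, 0, 0], [0, 0, -1], [0, -1, 0]]
H3 = layered_matrix([(C3, B3, A3)], [L3])
```

Here `H3` agrees entrywise with the 3-point Hartley matrix of Equation (2). The 6-point algorithm of Equation (19) fits the same pattern with $n = 2$, taking $\mathbf{C}_1 = \mathbf{B}_1 = \mathbf{I}$ and $\mathbf{L}_0 = \mathbf{0}$.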

6 Conclusions 

Short blocklength DHT fast algorithms that achieve the lower bound on the multiplicative complexity were derived. 
Low values for additive complexity were also found. These algorithms can be implemented in digital signal processors 
capable of low-power consumption. 